Evidence to support stimulus-stimulus pairing (SSP) in speech acquisition is less than robust, 
calling into question the ability of SSP to reliably establish automatically reinforcing properties 
of speech and limiting the procedure’s clinical utility for increasing vocalizations. We evaluated 
the effects of a modified SSP procedure on low-frequency within-session vocalizations that were 
further strengthened through programmed reinforcement. Procedural modifications (e.g., 
interspersed paired and unpaired trials) were designed to increase stimulus salience during SSP. 
All 3 participants, preschoolers with autism, showed differential increases of target over nontarget 
vocal responses during SSP. Results suggested an automatic reinforcement effect of SSP, 
although alternative interpretations are discussed, and suggestions are made for future research to 
determine the utility of SSP as a clinical intervention for speech-delayed children. 
Children with delayed speech have an 
instructional advantage if they emit frequent 
and varied vocal play and can repeat, even 
imprecisely, what they hear. Such fledgling 
speech can be shaped into accurate, complex 
topographies (e.g., Eikeseth & Nesset, 2003; 
Johnston & Johnston, 1972) that, taken 
together, characterize language when functional 
relations of these topographies and their 
controlling variables are established (see Sautter 
& LeBlanc, 2006). However, individuals who 
rarely vocalize or who do not readily imitate 
speech models have fewer opportunities to 
benefit from speech instruction because little 
behavior may be available for modification 
through reinforcement by a verbal community. 

To create a larger pool of available vocal 
behavior that can come under appropriate 
stimulus control, researchers have recently 
begun to investigate the application of stimu- 
lus-stimulus pairing (SSP), an approach with 
wide support in the basic behavioral literature 
(see Williams, 2002). Aimed at increasing the 
conditioned reinforcing value of speech sounds 
(i.e., auditory stimuli as vocal response prod- 
ucts), SSP is based on the rationale that early 
human vocal activity may develop, at least in 
part, from automatic reinforcement (Ahearn, 
Clark, MacDonald, & Chung, 2007; Vaughan 
& Michael, 1982) related to stimuli generated 
through speech behavior itself. In other words, 
early speech attempts may be self-strengthening 
in that these responses produce sounds that have 
value as reinforcers. 

Systematically pairing a stimulus of weaker 
value with already-effective reinforcers (Catania, 
1998) establishes the requisite history of 
contiguity (Michael, 2004) to condition it as a 
reinforcing stimulus in its own right. In the case 
of early speech acquisition, caregivers likely 
condition auditory stimuli by repeatedly pairing 
their own vocalizations with the delivery of 
important events (e.g., feeding, rocking). If a 
child subsequently produces similar sounds, 
these stimuli can function as automatic rein- 
forcers for movements that produce them, thus 
allowing particular topographies to be selected 
into the speech repertoire (Bijou & Baer, 1965; 
Schlinger, 1995; Skinner, 1957). They then 
undergo further shaping into more complex 
syllabic units through social (Skinner) and 
automatic (Palmer, 1996) contingencies of 
reinforcement. 

For children who engage in varied and 
frequent vocal play, the assumption is that the 
sounds produced have some value in the absolute 
sense (i.e., they sound good). Thus, early speech 
may be strengthened to some degree by its 
automatic consequences, irrespective of pro- 
grammed (i.e., socially mediated) contingencies 
that also may operate on these vocalizations. 
However, in children with speech delays, 
auditory speech stimuli may not function as 
reinforcers for vocal behavior, as evidenced by a 
weak repertoire of few or inconsistent responses 
that result in such stimuli. 

Applied studies using SSP to improve delayed 
speech have shown that, when an arbitrary syllable 
spoken by an adult is paired with a preferred 
stimulus, 1 children often subsequently emit the 
paired syllable, suggesting a procedural condi- 
tioning effect on the automatic reinforcing value 
of these auditory stimuli. Sundberg, Michael, 
Partington, and Sundberg (1996) paired novel 
syllables, words, or short phrases with known 
reinforcers for 4 preschoolers with severe to 
moderate language delays and 1 typically devel- 
oping child. After brief pairings (e.g., 15 pairings 
per minute for a few minutes), all children 
emitted the novel vocal responses, although effects 
dissipated rapidly and not all pairing periods 
resulted in increased target responding. 

1 In SSP studies of speech acquisition, speech stimuli are 
paired with preferred stimuli that may (or may not) have 
been shown to function as reinforcers for other responses. 
However, it should be noted that during SSP, no response 
is necessary for this “reinforcer” delivery to occur. Thus, 
the preferred stimulus does not function to strengthen a 
response but rather, through repeated pairings, to 
condition the speech stimulus as a reinforcer. 

In addition to influencing novel responses, 
SSP also has been shown to increase vocalizations 
that already exist in the repertoire. Smith, 
Michael, and Sundberg (1996) demonstrated 
postpairing increases with 2 typically developing 
infants (less than 18 months old) when acquired 
syllables were paired with reinforcer delivery. To 
address the possibility that observed increases in 
responding might be attributable to already- 
established echoic control, Smith et al. included a 
neutral condition in which the auditory stimulus 
was presented without reinforcer delivery. Be- 
cause the target sound was not emitted, the 
authors concluded that increased vocalizations 
following positive reinforcer pairings were not 
under echoic control but, instead, were auto- 
matically reinforced. 

Yoon and Bennett (2000) reported increased 
postpairing vocalizations with 4 speech-delayed 
preschoolers when speech sounds were paired 
with tickles for 3 min (12 pairings per minute) 
but, again, effects were temporary (3 to 
16 min). Target responses were observed (with 
one exception) only after SSP and not when 
echoic contingencies were in effect, thus under- 
scoring Smith et al. (1996) in ruling out echoic 
control as a possible explanation for the results. 
Showing less robust and more variable SSP 
effects but using a more rigorous experimental 
design, Miguel, Carr, and Michael (2002) 
observed increased postpairing vocalizations 
across sessions in 2 of 3 speech-delayed 
preschool boys on at least one target syllable 
after establishing a pairing history of speech 
sounds with candy delivery. 

In these studies and in others reporting 
absent or discrepant SSP effects (e.g., Esch, 
Carr, & Michael, 2005; Normand & Knoll, 
2006; Stock, Schulze, & Mirenda, 2008; Yoon 
& Feliciano, 2007), variables responsible for 
these inconsistencies have not been delineated. 
Differential responding may be related to level 
of preexisting language skills. Yoon and Bennett 
(2000), for example, reported greater postpair- 
ing increases by a participant with a relatively 
stronger preintervention vocal repertoire. By 
contrast, Miguel et al. (2002) reported that 
their participant with the strongest preexisting 
language skills demonstrated fewer increases in 
vocalizations following pairing. It may be that, 
for some children with more advanced language 
skills (e.g., mands, intraverbals; Skinner, 1957), 
reinforcement available through verbal interac- 
tions with others, as well as that provided by 
achieving parity (Palmer, 1996) with the 
linguistic practices of one’s verbal community, 
may supersede the effects of the SSP procedure 
on vocal play. However, Sundberg et al. (1996) 
observed postpairing increases in vocalizations 
by children with both strong and weak 
preintervention repertoires, whereas Esch et al. 
reported no vocalization increases after provid- 
ing an extensive pairing history for 3 children (6 
to 8 years old) with little preexisting vocal 
verbal behavior. Similarly, Normand and Knoll 
reported null SSP findings with a 3-year-old 
boy whose preintervention repertoire contained 
several vocal mands and tacts. Collectively, 
these results suggest that failure or success of 
SSP to produce effects cannot be attributed 
solely to idiosyncratic characteristics of existing 
language skills. 

It is likely that factors related to conditioning 
procedures influence this process more directly, 
although no SSP studies to date have focused on 
specific variables (or their modification) that 
may affect stimulus conditioning. The behav- 
ioral literature is replete with examples (see 
Kelleher & Gollub, 1962; Williams, 2002) in 
which stimuli have been conditioned through 
pairing with either unconditioned or condi- 
tioned reinforcers and, thus established, have 
served to increase arbitrary nonvocal respond- 
ing. In the case of automatically reinforced 
vocal responses, this conditioning process 
requires that speech stimuli become reinforcers 
themselves in order to strengthen the responses 
that produce them. With failures to condition 


or with discrepant performances, it may be that 
aspects of the pairing procedure impinge on the 
paired auditory stimulus in some way that 
constrains its sensitivity to the SSP process and 
to the durability of its effects. This suggests that 
procedures employed in SSP studies to date 
may not have been optimally arranged to 
produce conditioning. It is also possible that 
even if conditioning did occur, measurement 
systems may not have been sensitive enough to 
detect effects of SSP. In light of these 
possibilities, the current study was conducted 
to evaluate conditioning effects of SSP through 
various procedural modifications and post-SSP 
phases of socially mediated reinforcement. 

In earlier studies, sessions consisted of a series 
of trials in which a syllable was paired with 
delivery of a putative reinforcer (e.g., ba ba ba 
plus candy). To the degree that these stimuli were 
salient, ability of the speech stimulus to acquire 
reinforcing properties through pairing would be 
more or less strong. Evidence from basic 
experimental research (see Dinsmoor, 1995a, 
1995b) has shown that the effects of pairing can 
be enhanced by interspersing a stimulus that is 
not followed by a reinforcing stimulus (i.e., 
unpaired comparison S— ) with trials in which a 
different stimulus (S+) is followed immediately 
by such an event. In the case of vocal responses, 
this advantage would arise for a self-produced 
auditory stimulus that resembles one with a 
pairing history over that of a nonpaired response 
product. In the current study, interspersal of S — 
trials with S+ trials was designed to maximize 
these pairing effects and provided additional 
benefit by controlling for elicitation effects of a 
nontarget auditory stimulus. 2 

2 This use of the terms S+ and S− is unconventional in 
the sense that neither stimulus has discriminative 
properties (see Dinsmoor, 1995a, 1995b). These designations 
are used here to specify the degree of their 
conditioning potential in order to provide a contrast for 
the likelihood of either stimulus acquiring such discriminative 
properties. 

Another change to the SSP procedure 
involved the addition of an observing prompt 
(Dinsmoor, 1995b) prior to initiation of any 
trial. The purpose of this prompt (e.g., look) was 
to increase the likelihood that the succeeding 
auditory stimuli would be more salient as a 
result of the child’s attending response imme- 
diately preceding SSP presentation. This 
prompt preceded all trials because it was 
important for S — and S+ to be equally 
observable. Next, experimenters used exagger- 
ated prosodic patterns (motherese; Falk, 2004) 
when presenting stimuli of both S+ and S— to 
increase the likelihood that speech stimuli 
during sessions were different from nonrelevant 
speech stimuli occurring between formal ses- 
sions (e.g., incidental conversation during 
session breaks). Finally, varied intertrial intervals 
(ITIs; Catania, 1998; Gibbon & Balsam, 
1981) were used during baseline and SSP so 
that the passage of a fixed amount of time could 
not confound SSP effects. Eliminating temporal 
predictability while holding the relevant paired 
relation constant allowed target responses to be 
attributed more reliably to the conditioning 
procedure. 

Further, this study assessed the effects of 
specific reinforcement of responses that pro- 
duced a previously paired speech syllable. There 
is conceptual support for such an effect 
(Michael, 2004; Skinner, 1957, 1969), and of 
course, the clinical value for communication- 
impaired individuals is strong (e.g., Hall & 
Sundberg, 1987; Shafer, 1994). 

METHOD 

Participants and Setting 

Three children with severely delayed speech- 
language skills who had been diagnosed with 
autism participated in the study. Joshua, 2 years 4 
months old, had just been enrolled in school but 
classes had not yet begun. Madison, 2 years 8 
months old, had been attending a behavior- 
analytic preschool classroom for 1 month at the 
time of her participation. Daniel, 5 years 7 
months old, had previously received 1.5 years of 
school-based behavior-analytic instruction. None 


of the children displayed problem behavior (e.g., 
aggression, self-injury) or sensory loss (i.e., 
deafness, dysarthria), and all were from homes 
in which English was the primary language. 

Sessions were conducted at each child’s 
school 3 to 5 days per week and varied in 
duration from 5 to 15 min, depending on the 
condition. Sessions were typically conducted in 
a contiguous manner (several in a row) with 
brief play periods in between, and none were 
conducted immediately after lunch, recess, or 
playtime in order to maximize reinforcer value. 
Session rooms were carpeted and equipped with 
a small table, chairs, a video camera on a tripod, 
and low-preference toys (as identified by 
caregivers). Low-preference items remained 
present during sessions as part of the room’s 
equipment. Interaction with these items was 
allowed, but it rarely occurred when higher 
quality reinforcers were available (during ses- 
sions). All sessions were videotaped. Reinforcers 
(edible items and highly preferred toys) were 
kept in closed, opaque containers and were not 
available to the child except during appropriate 
session conditions. 

Preexperimental Assessments 

A speech pathologist administered standard- 
ized and criterion-referenced speech-language 
assessments prior to the study, except when 
parents completed informant reports. The 
Kaufman Speech Praxis Test (KSPT; Kaufman, 
1995) was given to determine participants’ 
existing echoic repertoires with single phonemes 
(e.g., /m/) and more complex syllabic (e.g., /ma/) 
and multisyllabic (e.g., /mama/) constructions. 
Of 24 available echoic models, Joshua, Madi- 
son, and Daniel echoed 0, 3, and 2, respec- 
tively. None met basal level responding on the 
Peabody Picture Vocabulary Test-III (Dunn, 
Dunn, & Dunn, 1997), a measure of receptive 
picture identification. Thus an alternate inven- 
tory, the Receptive-Expressive Emergent Lan- 
guage Test (3rd ed.) (Bzoch, League, & Brown, 
2003), was used to provide a reference point 
for comparing derived expressive and receptive 
language ages of participants. None of the 
participants scored over 12 months of age on 
this dichotomous (yes-no) measure of receptive 
and expressive language performance when 
observed in noninstructional settings. Additional 
speech-language information was obtained from 
informant observations of participants’ verbal 
repertoires, including echoic function, using the 
Behavioral Language Assessment (BLA; Sundberg 
& Partington, 1998). Joshua’s BLA indicated 
low-frequency vocal play, no echoic responses, 
and no other verbal operants. Madison’s BLA 
showed frequent vocal play but no verbal 
functions. Daniel’s BLA reported frequent vocal 
play and a few responses under mand, tact, or 
intraverbal control. He occasionally echoed 
simple sounds and words from the environment 
(e.g., television) but did not respond reliably to 
echoic prompts. 

The topography and frequency of phonemes 
in the participants’ vocal play repertoires were 
inventoried in a 30-min free-play observation 
period conducted during the week prior to the 
study. During this observation, Joshua emitted 
12 (of 42 charted) English phonemes. Norma- 
tive comparisons of spontaneous speech in 
typically developing peers have not been 
reported (see Smit, 1986), but phonetic 
transcription of speech samples from 520 
typically developing children in California 
showed that by the age of 3 years, children 
could accurately emit all phonemes except /l/ 
and /r/ (Porter & Hodson, 2001), albeit under 
tact or echoic conditions. The 12 phonemes 
Joshua emitted were distributed over less than 
half (37%) of the 30-s recording intervals. By 
contrast, Madison emitted 29 phonemes across 
93% of the intervals. Daniel demonstrated 
infrequent vocal behavior during free play, 
despite incipient verbal skills such as echoic, 
mand, and tact relations. He emitted 21 
phonemes, but these were distributed over only 
32% of intervals, a pattern similar to Joshua’s 
(whose verbal repertoire, in contrast to Daniel’s, 
contained no functional operants). 


Stimulus Preference Assessment 

Prior to the study, parents or teachers 
completed a preference assessment survey 
(Fisher, Piazza, Bowman, & Amari, 1996) that 
yielded a ranked list of each child’s preferred 
edible items and toys. These items were further 
assessed (Carr, Nicolson, & Higbee, 2000) to 
verify preference ranking. The three highest 
ranked items in each assessment were selected 
for use as putative reinforcers during the study. 
To address the possibility of day-to-day changes 
in relative preference for various stimuli, the 
first daily session began with the experimenter 
presenting an array of items previously identi- 
fied as preferred. Any items not touched, 
reached for, or accompanied by smiles when 
presented during a 1-min presession sampling 
period were eliminated, and remaining items 
were rotated randomly during that day’s 
sessions. For simplicity of description, items 
identified through these assessments as preferred 
henceforth will be referred to as reinforcers. 

Dependent Variables and Data Collection 

Targets were selected from vocalizations 
made during the free-play observation period 
and consisted of combinations of phonemes 
that were not under echoic control (as evaluated 
by KSPT and BLA), yet occurred in 10% to 
25% of 30-s intervals during the phoneme- 
inventory observation. This selection procedure 
ensured that topographies could be emitted by 
participants yet were under weak evocative 
control of an echoic stimulus. In cases in which 
fewer than 10% of intervals contained potential 
targets or phoneme units, targets were based on 
available (emitted) topographies. Targets (S+ 
responses) were beh and oo for Joshua, aypayk 
and sheba for Madison, and reeklo and tebba for 
Daniel. Interspersed S— responses were sio and 
ee for Joshua, oro and yoit for Madison, and 
boosie and ammi for Daniel. 
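
The interval-based selection criterion described above can be sketched in code. This is a minimal illustration, not the authors' scoring procedure: the function names, the input format (response onset times in seconds), and the inclusive treatment of the 10% and 25% bounds are assumptions made for the example.

```python
def percent_of_intervals(response_times_s, session_s=1800, interval_s=30):
    """Percentage of observation intervals containing at least one response.

    response_times_s: onset times (in seconds) of one vocal topography.
    Defaults reflect a 30-min observation scored in 30-s intervals.
    """
    n_intervals = session_s // interval_s
    # Each response marks the interval it falls in; duplicates collapse.
    occupied = {int(t // interval_s) for t in response_times_s if 0 <= t < session_s}
    return 100.0 * len(occupied) / n_intervals


def is_candidate_target(pct, low=10.0, high=25.0):
    """True if the topography occurred in 10% to 25% of intervals.

    Treating both bounds as inclusive is an assumption of this sketch.
    """
    return low <= pct <= high


# Hypothetical onsets: six responses, each in a different 30-s interval.
onsets = [0, 31, 61, 91, 121, 151]
pct = percent_of_intervals(onsets)
print(pct, is_candidate_target(pct))
```

With a 30-min observation there are 60 intervals, so the 10% to 25% window corresponds to a topography appearing in 6 to 15 of them.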

Target responding was defined as any vocal 
response that matched or was similar to the 
paired training stimulus (S+). Similarity was 
defined as acoustic or phonologic approxima- 
tion to a particular phoneme (e.g., same 
articulatory feature plus proximate placement). 
A nontarget response was defined as any response 
that was the same or similar to the unpaired 
comparison stimulus (S— ). Vocalizations that 
did not meet the definition of target or 
nontarget responses were not counted. Non- 
speech vocalizations (e.g., laughing, coughing) 
also were excluded. Any vocal response separat- 
ed by a 1-s silent interval from any other vocal 
response was counted as one response. 

Previous studies typically evaluated SSP 
effects on responding that occurred immediate- 
ly preceding and following pairings (i.e., 
between sessions). However, these observation 
periods often yielded data that indicated weak, 
temporary, or absent effects of the independent 
variable. The current study was originally 
designed to similarly evaluate responding, but 
when pre- and postpairing observations failed to 
demonstrate SSP effects with the 1st participant 
(Joshua), and, concomitantly, experimenters 
observed responding during pairing sessions, 
within-session data were examined to capture 
more accurately the effects of SSP. Further- 
more, analysis of within-session data allowed 
appropriate data comparison between condi- 
tions (SSP, programmed reinforcement) in 
which relevant stimuli were present, in contrast 
to postsession periods in which vocalizations 
produced the auditory stimulus but the paired 
reinforcer was absent. 

Each occurrence of target and nontarget vocal 
responses was recorded when emitted during 
varied ITIs of baseline and SSP conditions. 
During the programmed reinforcement and 
withdrawal conditions, each instance of the target 
response was recorded during the 5-min session. 
Responses during programmed reinforcement 
were not counted if they occurred within 5 s of 
the adult vocal stimulus (see below). 

Interobserver Agreement 

Two independent observers manually record- 
ed session data during randomly selected 
sessions (balanced across conditions) either 


during the session or from video recordings. 
Interobserver agreement on frequency of target 
and nontarget vocalizations was calculated 
separately by dividing the smaller frequency of 
vocalizations per session by the larger frequency 
per session and converting this ratio to a 
percentage. Target and nontarget interobserver 
agreement was assessed for sessions in which the 
relevant data were collected (target: baseline, 
SSP, programmed reinforcement, withdrawal; 
nontarget: baseline, SSP). For Daniel, all data 
were excluded for one session during which, 
when the school’s public address system came 
on unexpectedly, he ran from the training area. 
No data were excluded from any other sessions. 
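
The agreement calculation described above (smaller frequency divided by larger frequency, expressed as a percentage) can be written as a short function. This is a sketch for illustration only; the function name is hypothetical, and returning 100% when both observers record zero responses is an assumption, because the text does not state how that case was scored.

```python
def total_count_agreement(count_a, count_b):
    """Session-level interobserver agreement as a percentage.

    Divides the smaller of the two observers' frequency counts by the
    larger and multiplies by 100.
    """
    if count_a < 0 or count_b < 0:
        raise ValueError("frequency counts cannot be negative")
    if count_a == 0 and count_b == 0:
        # Assumption: two observers who both record nothing are in full agreement.
        return 100.0
    return 100.0 * min(count_a, count_b) / max(count_a, count_b)


# Hypothetical counts for one session from two observers.
print(total_count_agreement(9, 16))  # 56.25
print(total_count_agreement(0, 3))   # 0.0
```

Because the measure compares session totals only, a single session with very few responses can produce a low score (including 0%) from a disagreement of one or two responses.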

For Joshua, Topography 1 (beh) agreement 
was calculated for 81% of sessions; mean 
interobserver agreement was 89% (range, 56% 
to 100%). Nontarget agreement was calculated 
for 86% of sessions and was 100%. On 
Topography 2 (oo), agreement was assessed in 
73% of sessions, and mean agreement was 92% 
(range, 80% to 100%). Mean nontarget 
agreement for Topography 2, calculated in 
63% of sessions, was 83% (range, 0% to 
100%). For Madison, Topography 1 (aypayk) 
agreement was assessed across 35% of sessions, 
and mean agreement was 93% (range, 75% to 
100%). Nontarget agreement was assessed in 
40% of sessions, and mean agreement was 87% 
(range, 33% to 100%). Topography 2 (sheba) 
agreement for Madison was assessed in 35% of 
sessions, and mean agreement was 98% (range, 
91% to 100%). Nontarget agreement was 
calculated in 38% of sessions and was 100%. 
For Daniel, Topography 1 (reeklo) agreement 
was calculated for 29% of sessions, and mean 
agreement was 96% (range, 85% to 100%). 
Nontarget agreement was assessed in 29% of 
sessions and was 100%. For Topography 2 
(tebba), agreement was calculated for 38% of 
sessions, and mean agreement was 99.6% 
(range, 94% to 100%). Nontarget agreement 
was assessed in 33% of sessions, and mean 
agreement was 98% (range, 83% to 100%). 
Procedure 

Baseline. Each baseline session was approxi- 
mately 12 to 15 min in duration and consisted 
of 10 trials each of S+ and S— (20 trials total), 
randomly arranged but with no more than two 
consecutive trials of either to reduce stimulus 
predictability (Catania, 1998). In S+ trials, the 
experimenter presented the target stimulus (e.g., 
said, “beh”) without its paired stimulus. S — 
trials had no paired stimulus; thus, baseline S+ 
and S— trials were procedurally identical. Trials 
with Joshua were preceded with a prompt to 
attend (e.g., “look”). However, Madison often 
emitted the orienting prompt as part of the 
target response (e.g., “look! sheba”). Therefore, 
the auditory stimulus (look) was replaced with a 
clicker noise, effectively eliminating its inclu- 
sion in subsequent responses and possibly 
further distinguishing the target syllable as the 
more relevant stimulus from the orienting 
prompt. The clicker procedure was continued 
for the remaining participant (Daniel). Audito- 
ry stimuli were presented at the rate of one per 
second for 3 s (e.g., S+: ba ba ba; S−: dee dee 
dee). To decrease temporal predictability, trials 
were separated by an ITI that varied between 
5 s and 30 s. Between sessions, participants had 
access (in the session setting) to free play with 
low-preference toys. Interaction between par- 
ticipant and experimenter occurred only to the 
extent that it was necessary to ensure safety. 
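
The trial arrangement described above (10 S+ and 10 S− trials in random order, no more than two consecutive trials of either type, ITIs varying between 5 s and 30 s) can be sketched as follows. This is an illustrative reconstruction, not the authors' method: the function names are hypothetical, the text does not say how the order was randomized (rejection sampling is used here), and drawing ITIs from a uniform distribution is an assumption.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals no trial label
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


def make_session(n_per_type=10, max_run=2, iti_range=(5.0, 30.0), rng=None):
    """Return a list of (stimulus, iti_seconds) pairs for one session.

    Orders are reshuffled until no stimulus type appears more than
    max_run times in a row; each trial then gets an ITI drawn uniformly
    from iti_range (the uniform distribution is an assumption).
    """
    rng = rng or random.Random()
    trials = ["S+"] * n_per_type + ["S-"] * n_per_type
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            break
    return [(stim, rng.uniform(*iti_range)) for stim in trials]


for stim, iti in make_session(rng=random.Random(1))[:5]:
    print(f"{stim}  ITI {iti:.1f} s")
```

Rejection sampling keeps every admissible order equally likely, at the cost of a few discarded shuffles per session.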

Stimulus-stimulus pairings. Pairing sessions 
were similar to those in baseline, with two 
exceptions. First, presentation of S+ trials 
included immediate delivery of a reinforcer 
following the vocal stimulus. During S+ trials, 
the child had access to the reinforcer for 10 s or, 
in the case of edible items, until consumed. 
Second, during S+ trials a 20-s correction delay 
was implemented if the participant emitted the 
target response between the experimenter’s 
vocalization and delivery of the reinforcer. This 
delay controlled, to the extent possible, for 
adventitious reinforcement of responses through 
socially mediated contingencies. During this 


period, the experimenter did not look at or 
interact with the participant, and at the end of 
the delay a new trial began. If a reinforcer was 
already partially delivered, reinforcer delivery 
was completed but the response was not 
counted. Otherwise, reinforcer delivery was 
withheld and any responses during the delay 
were not counted. This procedure was designed 
to yield data reflective of SSP effects alone, 
insofar as possible. If the child emitted any 
other vocal response in the period between 
stimulus presentation and reinforcer delivery, 
no correction delay was imposed because target 
responding was the variable of experimental 
interest. Moreover, reinforcers only followed S+ 
stimuli, and the likelihood of a nontarget 
response occurring between S+ and reinforcer 
delivery was low. Furthermore, even if nontar- 
get responses (or any other vocalizations) were 
emitted between S+ and reinforcer delivery, 
thus undergoing adventitious reinforcement, 
any subsequent target (over nontarget) rate 
increase would further support SSP effects on 
the dependent variable. 

Programmed reinforcement. The purpose of the 
programmed reinforcement condition was to 
further strengthen SSP-induced target responses. 
During each 5-min session, the experimenter 
delivered a reinforcer within 5 s of a target 
vocalization. To maximize the likelihood of 
obtaining a reinforceable response, each session 
was preceded by SSP trials (S+ only) at the rate of 
one syllable per second (in triads) every 5 to 10 s 
until the child emitted a target response. These 
preliminary pairings were omitted if, when the 
session began, the child immediately emitted the 
target response. Data collection for these sessions 
began after the first target response and contin- 
ued for 5 min. During this period, if 1 min 
elapsed with no target responding, another S+ 
pairing was provided. 

Noncontingent reinforcement. The comparative 
effects of SSP and programmed reinforcement on 
target responding were further evaluated by 
conducting sessions in which noncontingent 
reinforcers were delivered for 5 min on a fixed- 
time (FT) 30-s schedule. If a target response 
occurred within 3 s of scheduled reinforcer 
delivery on the FT schedule, a 20-s correction 
delay was imposed. To the extent that responding 
occurred in the absence of programmed rein- 
forcement during this condition, conclusions 
could be made regarding the source of reinforce- 
ment for post-SSP responding and the possible 
separate influence of automatic reinforcement on 
response maintenance. 

Caregiver training. Caregivers for all partic- 
ipants received brief instruction in SSP and 
programmed reinforcement as part of the exit 
interview. This training was conducted to 
increase the likelihood that new vocal responses 
would be evoked and effectively maintained in 
the child’s natural environment. 

Experimental Design 

A nonconcurrent multiple baseline design 
(Watson & Workman, 1981) across phoneme 
topographies was combined with a reversal 
design to evaluate effects of the SSP procedure 
and subsequent programmed reinforcement on 
the frequency of within-session vocalizations. 

Treatment Integrity 

A trained observer assessed treatment integ- 
rity from observations, balanced across condi- 
tions, during sessions or from videotapes. Trials 
were scored as completely correct or incorrect. 
The number of correct trials was divided by the 
number of correct plus incorrect trials, and this 
ratio was converted to a percentage to yield a 
mean integrity score across sessions. Baseline 
trials were correct if (a) the orientation prompt 
preceded each presentation of S+ or S — , (b) 
three scheduled syllables (either S+ or S— ) were 
given within 5 s, and (c) no reinforcers followed 
S+ or S— presentations. An SSP trial was 
correct if (a) an attending prompt preceded 
each presentation of S+ or S — , (b) three 
scheduled syllables (either S+ or S— ) were 
given within 5 s, (c) reinforcers followed S+ 
presentations within 5 s and no reinforcers 


followed S— presentations, (d) an ITI of 5 to 
30 s occurred, and (e) a correction interval of 
20 s followed any target response that occurred 
between presentations of S+ and reinforcer. 
Programmed reinforcement trials were correct if 
a reinforcer was presented within 5 s after the 
child vocalized any target syllable that was not 
preceded within 3 s by an echoic prompt from 
the experimenter. Withdrawal trials were cor- 
rect if reinforcers were delivered on an FT 30-s 
schedule and a delay interval of 20 s occurred 
after any target behavior emitted within 5 s 
prior to scheduled reinforcer delivery. For 
Joshua, the mean integrity score was 99% for 
both evaluations across more than 50% of 
scored sessions. For Madison, the mean integ- 
rity score was at least 99% for both evaluations, 
calculated for 35% of sessions. Daniel’s mean 
integrity score was 100% for both evaluations 
and was calculated for at least 29% of sessions. 

RESULTS 

Results for Joshua are shown in Figure 1. 
The upper panel shows that rate of Topography 
1 (beh) responding increased slowly during SSP 
over near-zero baseline rates to over 4 responses 
per minute, with interim responding stable but 
varied (range, 0 to 2.8 responses per minute) 
(Figure 1, top). When programmed reinforce- 
ment was implemented, responding initially 
decreased. However, responding during this 
condition recovered to SSP levels, indicating a 
learning effect, and continued in an upward 
trend to over 7 responses per minute. When 
reinforcers were available noncontingently, 
target responding immediately decreased, but 
then was maintained at rates between 3 and 6 
responses per minute. Joshua did not emit any 
nontarget responses during the experiment for 
Topography 1, providing further support for 
positive SSP effects. Similar effects were 
observed in Joshua’s second evaluation. Topography 2 
(oo) responding (Figure 1, bottom) 
occurred at a mean rate of 0.6 responses per 
minute (range, 0 to 1.6) throughout baseline 

Figure 1. Rate of target and nontarget responses per session (20-trial blocks) for Joshua during within-session 
observations in baseline and stimulus-stimulus pairing (SSP) conditions and rate of target responses per 5-min session of 
programmed and noncontingent reinforcement. 


and the first 20 SSP sessions. Over the next 19 
SSP sessions, however, the mean rate increased 
to 2.2 (range, 0 to 7.8). When programmed 
reinforcement was implemented, responding 
was maintained at or above SSP levels, with a 
mean rate of 2.9 (range, 1 to 5). Responding 
was maintained during noncontingent rein- 
forcement, after an initial decrease to zero 
following programmed reinforcement, with a 
mean rate of 1.8 (range, 0.4 to 2.6). 

Results for Madison are shown in Figure 2. 
The upper panel shows Madison’s vocalizations 
of Topography 1 (aypayk). No nontarget vocalizations 
were emitted during the experiment. 
Baseline target responses were minimal, 
with a mean of 0.06 responses per minute 
(range, 0 to 0.5). Target responding steadily 
increased during SSP, showing modest differential 
pairing effects (levels did not exceed 2 
responses per minute). When programmed 
reinforcement for target responses was instated, 
rate of vocalizations immediately increased to 
nearly 9 per minute, stabilizing at about 6. 
Target overall mean rate during programmed 
reinforcement was 6.4 (range, 5.4 to 8.8). The 
immediacy of change in response level during 
this condition suggests strong differential influ- 
ence of social contingencies. During noncon- 
tingent reinforcement, responding decreased to 
near zero (0.6 rpm) within three sessions. After 
reinstating programmed reinforcement, re- 
sponding was reestablished at previously high 
levels, with a mean rate of response of 5.1 
(range, 0.5 to 7). In the second evaluation 
(Figure 2, bottom), baseline target (sbeba) and 
nontarget topographies were emitted at low 

Figure 2. Rate of target and nontarget responses per session (20-trial blocks) for Madison during within-session 
observations in baseline and stimulus-stimulus pairing (SSP) conditions and rate of target responses per 5-min session of 
programmed and noncontingent reinforcement. 


rates (except in Session 4 when both responses 
occurred approximately five times per minute). 
In SSP, there was a differential increase in target 
over nontarget vocalizations (which remained 
near zero throughout the SSP condition). 
Despite steady increases, the overall SSP mean 
target rate was 1 response per minute (range, 0 
to 2.3) compared to a baseline rate of 0.6 
(range, 0 to 4.8). During programmed rein- 
forcement, Topography 2 responding was 
variable, with an upward trend over SSP levels 
and a mean target rate of 4.1 (range, 0.6 to 9.4). 
When reinforcement was available noncontin- 
gently, response level immediately decreased, 
dropping to zero by the third session and 
continuing at or near zero throughout. Mean 
target responding during the withdrawal condition 
was 0.7 (range, 0 to 3). To recover the 
target vocalization, programmed reinforcement 
was reinstated briefly. Responding immediately 
increased to 3.4 responses per minute and was 
maintained near this level (mean rate, 3.3; 
range, 2.8 to 3.6). 

Daniel’s results are shown in Figure 3. 
During the evaluation of Topography 1 (reeklo, 
top), no baseline responding occurred for either 
topography. During the first half of SSP, the 
differential effect of pairing on targets was 
evident, with a mean rate of 2.7 (range, 0.3 to 
4.7) compared to the nontarget mean rate of 
0.5 (range, 0 to 1.1). However, during the 
second half of SSP, both target and nontarget 
vocalizations decreased to less than 0.5 respons- 
es per minute, with one exception (Session 24). 

Figure 3. Rate of target and nontarget responses per session (20-trial blocks) for Daniel during within-session 
observations in baseline and stimulus-stimulus pairing conditions and rate of target responses per 3-min session of 
programmed and noncontingent reinforcement. 


When programmed reinforcement was initiat- 
ed, levels of target responding showed an 
immediate increase (5.8) then continued above 
2.5 (mean rate, 4). Under noncontingent 
reinforcement, target responding decreased to 
0.6 rpm. Programmed reinforcement was rein- 
stated and target responding immediately 
increased to 5.9, higher than response levels 
during the first programmed condition. The 
mean target frequency during the second 
programmed reinforcement condition was 6.6 
(range, 5.4 to 7.6). 

Figure 3 (bottom) shows Daniel’s vocalizations 
during Topography 2 (tebba) training. 
During baseline, after an initial period when 
both target and nontarget topographies were 
emitted at fairly high rates (approximately 
10 rpm during Session 1), vocalizations varied 
in a downward trend between 0 and approxi- 
mately 4 responses per minute, and then 
stabilized at zero. Target responding remained 
below 1 for the first four sessions of SSP but 
then increased to approximately 3. At this 
point, given the prior results with Topography 
1, programmed reinforcement was initiated to 
avoid response loss and maximize clinical 
benefit by strengthening this topography. The 
target behavior immediately responded to the 
contingency with an initial frequency of 15 
responses per minute, increasing to 19 by the 
third session of this condition. The response 
level decreased to zero when reinforcement was 
available noncontingently, and reinstatement of 
the programmed contingency again resulted in 
immediate increases in responding to 5-9, 
increasing to a mean overall rate of 8.2. 

DISCUSSION 

This study investigated the effects of an 
enhanced stimulus-stimulus pairing procedure 
on vocal responses of children with autism and 
the subsequent effects of programmed rein- 
forcement on pairing-induced speech responses. 
Results showed that target vocalizations in- 
creased during SSP to acceptable but moderate 
levels over baseline and that these increases 
occurred in the absence of programmed 
reinforcement. Moreover, in all cases, topogra- 
phies were further strengthened at or above SSP 
levels through subsequent programmed rein- 
forcement. However, when reinforcement was 
available noncontingently, only the participant 
with the lowest preexperimental vocal repertoire 
(Joshua) demonstrated target maintenance. 
Thus, the current study offers only modest 
evidence that SSP-induced speech is strength- 
ened by its absolute conditioned reinforcing 
value, particularly in speech-delayed children 
with existing moderate vocal play. Rather, social 
contingencies of reinforcement may play a 
greater role in strengthening incipient vocaliza- 
tions for these and other early speech learners. 

The clinical utility of SSP may have varied 
for individual participants. Except for Daniel’s 
Topography 1 (reeklo), all targets occurred at 
least once during baseline. It is possible that 
vocalizations could have been increased without 
SSP but, instead, through programmed rein- 
forcement alone. However, this study was not 
designed to evaluate the separate effects of this 
variable but rather to employ more optimal SSP 
procedures to augment low-frequency vocaliza- 
tions in children who were not readily respon- 
sive to verbal operant contingencies. This 
potential clinical benefit of SSP is most evident 
for Daniel, whose target response (reeklo) did 
not occur at all during baseline but occurred 
during SSP and was strengthened differentially 
over a nontarget response. To a lesser extent, 
this may also be true for Madison’s target 
aypayk and Joshua’s target beh; both were 
observed during baseline, but occurred only 
weakly compared to the more robust and 
differential performance during SSP. 

The clinical value of SSP is to establish or 
increase incipient vocal responses not currently 
under operant control so that speech training can 
proceed by arranging social contingencies (e.g., 
mand) that will strengthen and maintain func- 
tional language. With all 3 participants in this 
study, preexperimental vocalizations in general 
occurred at low frequencies, and previous efforts 
to establish echoic or mand control over the 
existing vocal repertoire had largely failed, thus 
underscoring SSP as a viable alternative for initial 
speech training. However, in this study, as in 
several previous SSP investigations, performances 
varied across participants, and this may be due to 
multiple factors. 

Differential increases in target over nontarget 
responding during SSP suggest that paired 
speech stimuli became conditioned such that 
similar response-produced stimuli functioned to 
strengthen those responses. The strongest 
evidence for this is Joshua’s response mainte- 
nance during noncontingent reinforcement 
(NCR) when socially mediated reinforcement 
was not contingent on target responding and, 
thus, continued vocal behavior may be attrib- 
utable, at least in part, to automatic reinforce- 
ment contingencies. However, the brevity of 
this phase makes it difficult to rule out control 
by social contingencies and, particularly for 
Madison and Daniel, response decreases during 
NCR may simply point to the relative strength 
of these contingencies on vocal behavior. 

Daniel’s data in the first SSP evaluation (see 
Figure 3, top) merit further analysis. Early 
target responding suggests that the S+ reeklo 
was initially conditioned to some degree, only 
to lose its value in the latter half of the phase. 
Competing contingencies may have influenced 
response strength. Daniel had a stronger history 
of reinforcement for verbal behavior than 
Joshua and Madison did. Thus, responding that 
occurred possibly under partial influence of this 
history may have decreased, because these 
contingencies were not currently active and, 
concomitantly, an omission contingency (the 
correction delay) was in place. In the current 
study, as in Miguel et al. (2002), a 20-s delay 
was imposed during SSP if target responding 
occurred between presentations of the two 
stimuli being paired. This procedure was an 
attempt to control for adventitious reinforce- 
ment of responses under socially mediated 
contingencies (e.g., echoic), but it is possible 
that this correction delay functioned to suppress 
responding overall. This would explain the 
decreasing trend during SSP in target vocaliza- 
tions for Daniel and, perhaps, the sometimes 
variable and sluggish responding by the other 2 
participants. 

In addition to the SSP correction, Miguel et 
al. (2002) also imposed a correction delay after 
responding that occurred during the ITI. The 
purpose was to control for selection of these 
responses through social contingencies available 
in the next scheduled SSP (e.g., tangible items). 
No such delay was employed in the current 
study; suppression of responding during 
the variable ITI was undesirable because this 
interval yielded the actual within-session experimental 
data (unlike Miguel et al., in which data 
were collected during pre- and postsession 
observations). Because no ITI delay was used, 
it is possible that any responses, target and 
nontarget alike, that occurred contiguously with 
the next pairing were susceptible to adventitious 
reinforcement. Certainly, both responses had 
equal opportunity to occur based solely on 
occurrence of programmed auditory stimuli 
(i.e., S+ and S− syllables). However, to the 
extent that SSP increased target responding 
differentially, more target responses would be 
available for incidental reinforcement. Even so, 
other factors make it difficult to attribute 
increased target responding during SSP to the 
absence of an ITI correction. First, the S− 
control (not previously used in SSP studies) 
ensured that nonpairing trials were as likely to 
follow the ITI as were trials of SSP, thus 
eliminating availability of reinforcing stimuli 
during these periods. In addition, reinforcers in 
SSP occurred after the variable ITI, regardless of 
response occurrence during ITI, thus further 
disrupting any contingent relation between ITI 
response and SSP reinforcer. 

Several SSP procedural modifications were 
made, although not separately evaluated, to 
overcome the transience and variability of the 
procedure’s typical effects under the assumption 
that strong stimulus salience during pairing is 
likely to produce robust conditioning. An 
orienting prompt that initiated all baseline 
and SSP trials may have increased the impact 
of ensuing stimuli, contributing to differential 
effects observed in SSP. However, the role of 
orienting responses to the prompt was not 
evaluated. To differentially establish paired 
stimuli as conditioned reinforcers for responses 
that produced these stimuli, nonpaired stimuli 
(S−) were randomly, but equally, interspersed 
during baseline and SSP sessions. This modifi- 
cation may have facilitated salience of the paired 
stimulus by changing the ratio of reinforcer 
probability (Gibbon & Balsam, 1981) and thus 
increased the likelihood that important stimuli 
would be more potent (Dinsmoor, 1995a) and 
that participants would disregard, relative to S+, 
the unimportant stimuli (i.e., those that were 
not followed by reinforcers). (The term rein- 
forcer, in this instance, refers to the paired 
stimulus and is not used in the typical sense, 
because no response is required for this stimulus 
to be delivered.) The added experimental 
control offered by the S− feature supports the 
interpretation of a differential conditioning 
effect of SSP on paired stimuli, because 
nontarget responses that produced unpaired 
stimuli failed to increase commensurately with 
target responses. It should be noted, however, 
that, initially, S+ and S− might have been 
equally salient for all participants (see baseline 
for Daniel’s, Joshua’s, and Madison’s Topography 2). 
Finally, an ITI varying in duration from 
5 s to 30 s was used to highlight the unpre- 
dictability of trial stimuli and thus to increase 
attending responses to these stimuli (Gibbon & 
Balsam). In addition to earlier comments 
regarding the ITI, it should be noted that 
shorter (variable) durations may have been 
insufficient to make stimuli optimally salient 
and thus discriminable. For instance, with 
certain edible reinforcers, it is possible that 
even with maximum interval values, food taste 
dissipated slowly and remained available as a 
reinforcing stimulus during nonpaired trials, 
making the difference between S+ and S− less 
observable. 

The study also may have been limited by the 
method used to calculate interobserver agree- 
ment. Although observations of target and 
nontarget responses were calculated separately, 
comparing all responses in briefer time bins 
(e.g., 10 s) would increase confidence in 
response occurrence. Finally, there was a 
practical challenge in delivering some types of 
reinforcers (e.g., bubbles) between one response 
and the next, which may have maintained a 
triadic response pattern (e.g., beh beh beh) that 
children often emitted. Although there is little 
reason to expect an altered topography (i.e., a 
single response) because the paired stimulus was 
triadic, the triadic response form often persisted 
during programmed reinforcement even when 
reinforcer delivery followed a single response 
(e.g., beh). Such idiosyncratic responding em- 
phasizes the need, in applied settings, to manage 
the influence of stimuli that are prominent, yet 
irrelevant, to the learning task. 
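
The briefer time bins suggested above for interobserver agreement can be illustrated with a short calculation. This is a sketch of one common approach (mean count-per-interval agreement), not the method used in the study. The function names, the use of response timestamps in seconds, and the 10-s default bin are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def bin_counts(timestamps: Sequence[float], session_s: float,
               bin_s: float = 10.0) -> List[int]:
    """Count responses falling in each consecutive bin of the session."""
    n_bins = math.ceil(session_s / bin_s)
    counts = [0] * n_bins
    for t in timestamps:
        if 0 <= t < session_s:          # ignore records outside the session
            counts[int(t // bin_s)] += 1
    return counts


def mean_count_per_interval_ioa(obs_a: Sequence[float], obs_b: Sequence[float],
                                session_s: float, bin_s: float = 10.0) -> float:
    """Average, across bins, of smaller count / larger count, as a percentage."""
    counts_a = bin_counts(obs_a, session_s, bin_s)
    counts_b = bin_counts(obs_b, session_s, bin_s)
    ratios = []
    for a, b in zip(counts_a, counts_b):
        # Bins with identical counts (including 0 and 0) count as full agreement.
        ratios.append(1.0 if a == b else min(a, b) / max(a, b))
    return 100.0 * sum(ratios) / len(ratios)
```

Running the function separately on target and nontarget records would keep the two response classes distinct while still requiring observers to agree on when, not only how often, responses occurred.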

In SSP research, two issues seem most 
important to investigate. First, reliable effects 
have been elusive. The transience of speech as 
an environmental stimulus may render it too 
obscure to become easily conditioned as a 
reinforcer for individuals who unreliably re- 
spond to stimulus changes in general. Thus, 
future researchers might focus on evaluating 
SSP procedural components designed to increase 
stimulus salience to determine which are 
necessary and sufficient to produce the most 
reliable effects. Modifications such as those used 
collectively in the current study are examples of 
components whose contributions could be 
separately evaluated, one of the most important 
being the S− feature, to inform treatment 
effects on both target and nontarget responses. 
More frequent preference sampling could also 
be employed to inform shifts in motivating 
operations (Laraway, Snycerski, Michael, & 
Poling, 2003) that affect the value of stimuli 
used as reinforcers in the pairing process. 

The other important concern is isolation of 
the role of automatic reinforcement in produc- 
ing SSP effects. Controls typically used to 
demonstrate a programmed reinforcement effect 
(Thompson, Iwata, Hanley, Dozier, & 
Samaha, 2003) would be appropriate to 
evaluate the effect of automatically produced 
stimuli on target behaviors. In particular, 
response maintenance could be evaluated dur- 
ing extinction, which, in the case of SSP studies, 
would require the vocal response to occur 
without being followed by its putative reinforc- 
er (the auditory stimulus). Masking the stimu- 
lus to prevent auditory detection would be one 
way to accomplish this. A more rigorous control 
would be achieved through NCR, which allows 
the necessary elements of both the contingent 
relation and the presence of the auditory 
stimulus to be evaluated. It should be noted 
that the NCR condition in the current study 
controlled more directly for the effects of 
programmed (vs. automatic) reinforcement on 
the target response because the reinforcer 
delivered during NCR was not the auditory 
stimulus conditioned during SSP; rather, it was 
the item delivered in the programmed rein- 
forcement condition. Therefore, any response 
decrease from the programmed reinforcement 
condition to NCR would suggest influence by 
programmed contingencies; this, indeed, was 
the case with Daniel and Madison. However, 
Joshua’s target response maintenance during 
NCR suggested indirectly that the auditory 
stimulus retained some conditioned value as an 
automatic reinforcer. To evaluate more directly 
the effects of automatic reinforcement using an 
NCR control, a post-SSP condition might 
provide the target syllable played periodically 
through speakers, thus eliminating the contin- 
gency between the target response and the 
reinforcing stimulus. Hence, any subsequent 
response decrease might be attributable to the 
noncontingent availability of the stimulus that, 
indeed, functioned as a reinforcer for the 
response. Researchers might also evaluate the 
strength of previously paired auditory stimuli to 
reinforce nonvocal arbitrary responses such as a 
button press (see Esch et al., 2005), although it 
could be argued that this type of procedure 
involves contingencies that are more appropri- 
ately termed programmed than automatic. 

Evaluation of the role of conditioned auto- 
matic reinforcement in SSP will require future 
researchers to consider the separate functions 
that result in novel response generation (the first 
response), response strengthening, and mainte- 
nance of targeted topographies. Some have 
suggested that the first response in SSP may be a 
type of echoic reflex (Tonneau, 2005) potenti- 
ated by SSP, with subsequent vocalizations 
maintained by the conditioned reinforcing 
stimuli they produce. This suggests that 
differential effects of SSP may be dependent 
on some level of an echoic or perhaps another 
facilitative duplic (i.e., nonvocal imitation) 
repertoire. However, if auditory speech stimuli 
could elicit a type of echoic reflex, it is unclear 
why vocalizations in general would be resistant 
to echoic operant contingencies, as is the case 
with many early speech learners with autism. 

SSP research to date has often shown transient 
conditioning effects. However, to the extent that 
automatic reinforcement plays a role in strength- 
ening SSP responding, temporary effects are not 
unexpected, nor are they undesirable. The goal of 
SSP is to generate sufficient vocal behavior such 
that it can come under the control of naturally 
occurring contingencies that are adaptive for the 
speaker. Some of these are social contingencies 
(Skinner, 1957), but certain contingencies of 
automatic reinforcement (such as achieving parity 
with the practices of a verbal community; Palmer, 
1996) may be equally requisite for development of 
a complex repertoire. Therefore, practitioners 
should anticipate transient SSP effects and 
proactively engineer social contingencies to pro- 
mote long-term maintenance of new responses. 

Finally, failed or equivocal SSP findings (e.g., 
Esch et al., 2005; Miguel et al., 2002), further 
illustrated in performance differences in the 
current study, may suggest some yet undefined, 
but requisite, learner repertoire that is optimally 
responsive to the SSP conditioning process. One 
such possibility is the learner’s history of 
responding to environmental stimuli, particularly 
verbal stimuli, whether self-generated or pro- 
duced by others. In the case of automatically 
reinforced verbal behavior, the speaker also 
necessarily functions as his own listener (Palmer, 
1996). Thus, deficits in a listener repertoire 
would preclude benefit from these contingencies. 
In SSP research to date, most participant 
speaker-listener repertoires have been described 
as largely nonfunctional. Therefore, it may be 
necessary not only to evaluate but also to teach 
prerequisite listener skills that support attending 
to verbal stimuli (e.g., orienting, following simple 
instructions). One corollary benefit of such 
training might be increased vocal play whose 
frequency and topographic diversity may 
support the SSP conditioning process. 